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Abstract 

Current  methods  in  motor  control  have  problems  dealing  effectively  with  highly  variable  dynamic 
inertial  interactions  between  multijointed  robots  and  payloads.  We  are  developing  an  autonomous 
neural  network  controller  that  can  overcome  these  difficulties  by  learning  to  anticipate  the  inertial 
interactions  from  its  own  experience.  The  neural  network  controller  will  allow  robots  to  handle 
diverse  payloads  in  uncertain  environments  to  benefit  a  wide  variety  of  material  handling 
applications.  Our  target  application  is  bin-picking,  the  grasping  of  an  object  from  a  bin  containing 
many  randomly  oriented  objects  and  placing  it  at  a  desired  location. 

During  this  quarter  of  the  SBIR  Phase  II  contract,  we  focused  on  the  dynamic  control  aspect  of  the 
problem  by  extending  our  working  implementation  of  the  neural  network  controller  from  the  Phase 
I  effort.  Using  a  commercially  available  scara-type  robot,  we  demonstrated  a  functional  prototype 
of  the  neural  network  controller  for  realizing  point-to-point  control.  The  controller  design  consists 
of  dynamic  position  and  velocity  servos  in  parallel  with  an  adaptive  neural  network  controller  for 
each  joint.  The  controller  adaptively  learns  to  compensate  for  the  dynamic  inertial  interactions  with 
different  payloads  through  its  own  experience.  Using  two  joints  of  the  scara  robot,  the  controller 
achieved  a  position  accuracy  of  0.2%  of  the  joint  range,  a  timing  accuracy  of  within  5%  of  the 
requested  movement  time,  and  an  end-point  stability  of  within  8%  of  the  maximum  planned 
velocity.  This  performance  was  measured  on  both  joints  after  only  150  training  iterations  with  a 
movement  that  had  large  dynamic  coupling  forces  between  the  scara  links. 
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1.  Introduction 

For  typical  material  handling  applications  in  automation,  80%  of  the  cost  is  required  for 
customized  tooling  in  order  to  constrain  both  the  handling  process  and  part  presentation.  This 
tooling  is  necessary  to  maintain  very  low  tolerances  between  the  expected  material  path  and  the 
robots  handling  the  material  because  current  robot  control  technology  cannot  deal  effectively  with 
variations  in  the  workplace.  Controllers  which  allow  robots  to  be  flexible  and  adaptable  to  changes 
in  the  environment  would  significantly  reduce  the  tooling  cost. 

For  this  research  contract,  we  are  focusing  our  development  work  on  the  problem  of  bin-picking. 
In  this  application,  a  robot  must  pick  out  a  single  object  from  a  bin  containing  many  randomly 
oriented  objects  and  then  place  it  at  a  desired  location.  For  the  robot  to  be  flexible  and  adaptable  in 
doing  bin-picking,  it  must  be  able  to  pick  up  and  move  a  wide  variety  of  objects  with  different  sizes, 
shapes,  weights,  and  weight  distributions.  Then,  the  same  robot  could  be  used  for  most  bin-picking 
applications  without  requiring  any  customized  tooling  or  reprogramming. 


Bin-picking  can  be  divided  into  five  tasks: 

1 .  Dynamic  multijoint  control.  Moving  a  multijoint  robot  arm  with  or  without  an 
object  from  one  location  to  another  introduces  inertial  interactions  and  dynamic 
coupling  between  the  joints.  These  dynamic  effects  are  enhanced  if  the  weight 
of  the  object  is  comparable  to  or  larger  than  the  weight  of  the  arm. 

2.  Object  singulation.  Given  a  bin  of  randomly  oriented  and  overlapping  objects, 
the  controller  must  be  able  to  single  out  one  of  the  objects  and  determine  its  ori¬ 
entation  and  location  in  three-dimensional  space. 

3.  Visually-guided  reaching.  Assuming  the  location  and  orientation  of  the  object 
has  been  computed,  the  controller  must  be  able  to  move  the  gripper  on  the  arm 
from  its  cuiTent  position  to  the  object  with  the  correct  orientation  in  three- 
dimensional  physical  space. 

4.  Sensor  fusion  of  vision,  tactile  and  force-feedback  for  gripping.  Once  the 
gripper  has  been  positioned  very  close  to  the  object,  these  sensor  modalities 
must  be  analyzed  in  conjunction  with  each  other  in  order  to  fine-tune  the  pro¬ 
cess  of  gripping  the  object. 

5.  Holding  and  releasing  the  object.  After  the  object  has  been  gripped,  the  con¬ 
troller  may  need  to  adjust  the  grip  in  order  to  prevent  the  object  either  from  slip¬ 
ping  out  during  a  fast  movement  or  from  being  damaged  by  the  gripper.  In 
addition,  the  controller  also  must  sense  the  actual  orientation  of  the  object  in  the 
gripper.  Different  orientations  most  likely  will  require  different  end  points  for 
the  movement  in  order  for  the  object  to  be  placed  at  its  proper  location  with  the 
correct  orientation. 

In  this  organization,  each  task  presumes  the  previous  tasks  have  been  completed  and  are  available. 


During  this  quarter,  we  concentrated  on  the  first  task,  dynamic  multijoint  control.  The  goal  of  this 
task  is  to  generate  fast,  accurate,  and  stable  movements.  However,  the  exact  trajectory  taken  during 
the  movement  is  not  critical.  Thus,  we  are  concentrating  on  a  point-to-point  control  strategy. 

Current  control  methods  in  industry  rely  mainly  on  proportional-integral-differential  (PID)  control. 
Although  simple,  PID  controllers  have  several  shortcomings  that  prevent  them  from  being  useful 
for  general  material  handling  applications.  First,  they  must  be  tuned  manually  for  any  given  type 
of  robot.  This  tuning  results  in  a  set  of  three  parameters  known  as  the  PID  coefficients.  These 
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coefficients  are  used  in  the  PID  formulation  to  control  the  robot  over  its  entire  range  of  movement, 
speed,  and  payload.  However,  the  dynamic  forces  on  the  robot  arm  vary  with  position  and  payload. 
Therefore,  the  PID  coefficients  must  be  selected  with  the  most  common  movement  range  and 
payload  in  mind.  As  a  result,  the  performance  of  the  robot  over  its  full  range  of  operation  can  be 
reduced  significantly.  In  material  handling  applications  involving  many  robots,  one  slow  robot  can 
create  a  bottleneck  for  the  entire  automation  line. 

In  the  regions  of  the  robot  range  where  the  PED  coefficients  are  not  well  tuned,  the  robot  arm  can 
exhibit  several  characteristics  that  further  degrade  the  performance  of  the  movements.  With  a 
point-to-point  control  strategy,  the  main  problem  in  realizing  fast,  accurate,  and  stable  movements 
is  end-point  oscillations.  When  a  robot  is  carrying  large  payloads  relative  to  its  maximum  payload, 
a  fast  movement  implies  the  robotic  arm  will  develop  a  relatively  large  amount  of  momentum  due 

to  the  inertial  qualities  of  the  payload. 1  This  momentum  can  cause  the  arm  to  overshoot  the  desired 
position  and  possibly  oscillate  about  it.  A  stable  movement  is  defined  to  be  one  in  which  the  end 
effector  of  the  robot  does  not  overshoot  nor  oscillate  about  the  desired  final  position  in  physical 
space.  Similarly,  an  accurate  movement  is  defined  to  be  one  in  which  the  end  effector  stops  within 
an  acceptable  tolerance  of  the  desired  final  position  at  the  right  time. 

The  goal  for  this  quarter  was  to  extend  our  dynamic  control  solution  for  obtaining  fast,  accurate, 
and  stable  movements  with  a  single  joint  robot  to  include  multiple  joints.  The  main  issue  to  be 
resolved  here  is  the  dynamic  coupling  between  the  joints.  For  any  particular  joint,  the  dynamics  of 
how  that  joint  moves  depends  on  where  the  other  joints  are  and  what  they  are  doing.  For  N  joints, 
this  problem  is  iV-dimensional  in  complexity.  The  solution  we  seek  must  be  as  simple  as  possible, 
rely  on  a  minimum  number  of  adaptable  parameters  and  fixed  constants,  be  insensitive  to  noise  and 
faults,  operate  autonomously  (hands-off),  and  generalize  to  any  robot  with  any  number  of  joints 
and  degrees  of  freedom.  Furthermore,  the  solution  must  be  robust  in  that  the  adaptable  parameters 
are  guaranteed  to  converge  to  values  which  always  satisfy  the  desired  accuracy  and  stability 
conditions  for  the  movement. 

The  robot  used  to  develop  the  dynamic  multijoint  control  formulation  is  a  commercially  available 
American  Robot  model  S-1400  and  is  depicted  in  Figure  1.  It  is  a  four-axis  scara-type  robot  with 
a  maximum  payload  of  50  pounds.  Only  two  axes  were  used  for  our  initial  demonstrations,  the 
shoulder  joint  (Jl)  and  the  elbow  joint  (J2).  The  links  associated  with  these  joints  operate  in  a 
horizontal  plane.  The  other  two  joints  were  not  powered.  The  shoulder  joint  is  capable  of  moving 
at  speeds  of  approximately  8  feet  per  second  and  has  a  range  of  300  degrees.  Furthermore,  it 
contains  a  high-resolution  optical  encoder  which  provides  258,000  counts  over  the  range  for 
accurate  position  and  velocity  information.  A  386  PC  computer  communicates  with  the  robot  and 
its  drive  electronics  via  a  commercial  data  acquisition  board  (Data  Translation  model  DT2812-A) 
and  an  encoder  board  (Technology  80  model  5638). 

Our  dynamic  multijoint  controller  is  a  biologically  inspired  control  system  with  adaptable 
parameters  to  govern  the  dynamics.  The  operation  of  the  controller  is  feedforward  at  a  high  level 
and  feedback  at  a  low  level.  The  high-level  feedforward  aspect  is  due  to  the  fact  that  the  governing 
parameters  are  computed  before  each  movement.  During  a  movement,  these  parameters  are  fixed 
and  local  position  and  velocity  feedback  is  used  to  allow  the  movement  dynamics  to  emerge 
naturally.  At  the  end  of  the  movement,  the  actual  timing,  accuracy,  and  stability  are  measured  to 
form  an  error  criterion  for  adapting  all  the  governing  parameters.  The  accuracy  error  is  a  position 
and  time  measurement  and  the  stability  error  is  a  velocity  measurement. 


1.  Even  with  no  payload,  a  robotic  arm  can  exhibit  inertial  effects  when  it  is  relatively  heavy. 
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The  dynamic  multijoint  controller  is  composed  of  identical  local  controllers  for  each  joint  that 
receive  global  position  inputs  from  all  joints.  Each  joint  controller  is  responsible  for  moving  its 
corresponding  joint  without  any  knowledge  of  the  control  signals  being  sent  to  the  other  joints.  The 
inputs  to  each  joint  controller  are  the  desired  movement  time,  the  initial  and  desired  final  positions 
for  all  joints,  and  a  subjective  measure  of  the  current  payload.  The  neural  networks  used  in  each 
joint  controller  to  store  the  adaptable  parameters  are  structured  to  reduce  considerably  complexity 
of  the  dynamic  coupling  problem  without  any  loss  of  generality.  The  dynamic  multijoint  controller 
is  described  in  more  detail  in  Section  2. 

The  approach  of  measuring  the  error  information  once  for  each  movement  is  in  contrast  to  current 
adaptive  methods  for  controlling  robots.  These  methods  are  much  more  complicated  than  the 
prevalent  PID  controllers.  They  involve  one  or  more  optimal  criteria  and  need  to  update  the 
governing  parameters  during  the  movement  to  achieve  the  criteria.  As  a  result,  these  methods 
require  much  more  computation  power  in  order  to  sample  the  state  of  the  robot’s  joints  quickly.  In 
contrast,  our  approach  is  significantly  less  demanding  computationally  during  the  movement. 
Instead,  our  high-level  feedforward  solution  performs  most  of  the  computation  before  the 
movement  is  started.  The  low-level  feedback  components  (the  local  joint  controllers)  involve  only 
simple  feedback  and  calculations  at  the  joint. 

In  addition,  the  optimal  adaptive  control  formulations  are  global  methods  in  that  the  next  motor 
signal  a  joint  receives  during  a  movement  depends  upon  the  current  states  of  all  joints.  Our  joint 
controllers  are  local  and  do  not  depend  upon  the  other  joints  during  a  movement.  A  joint  controller 
only  uses  the  initial  and  final  positions  of  all  joints  before  a  movement  to  calculate  the  current  set 
of  governing  parameters. 

Our  experimental  results  with  the  dynamic  multijoint  controller  are  discussed  in  Section  3.  We 
implemented  a  simplified  version  of  the  controller  which  did  not  take  into  account  variable 
payloads  at  this  time  because  we  did  not  have  any  way  to  subjectively  measure  them  during  this 
quarter.  The  results  show  that  the  controller  learns  to  make  accurate  and  stable  movements  very 
quickly.  The  adaptable  governing  parameters  converge  onto  their  optimal  values,  albeit  more 


Figure  1:  Schematic  of  the  scara  robot  and  its  associated  control  hard¬ 
ware  used  to  develop  the  dynamic  multijoint  control  formulation  of 
our  neural  network  controller. 
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slowly  than  the  movement  performance  would  suggest.  This  characteristic  indicates  the  robustness 
of  the  controller  design. 

2.  Dynamic  Multijoint  Control  Module 

For  a  robot  with  N  joints,  the  dynamic  multijoint  control  module  in  the  neural  network  controller 
is  composed  of  N  local  joint  controllers,  one  of  which  is  shown  in  Figure  2.  Each  local  joint 
controller  operates  in  two  modes,  posture  and  movement.  The  role  of  the  posture  mode  is  to  keep 
the  arm  at  its  desired  position,  irrespective  of  gravity  and  payload.  If  a  new  desired  position  is  set, 
the  posture  mode  will  move  the  arm  there  with  accuracy  but  not  on  time  or  with  stability. 
Movement  mode  is  responsible  for  establishing  the  on-time  arrival  and  the  end-point  stability. 
These  modes  operate  in  parallel  (additively)  with  the  movement  mode  being  activated  when  the 
desired  position  changes. 

The  local  joint  controller  has  four  inputs: 

1 .  The  initial  position  of  each  joint  in  the  arm  in  joint  space  (as  opposed  to  physical 
space).  These  angles  will  be  denoted  by  xiQ. 

2.  The  desired  position  for  each  joint  in  joint  space,  denoted  by  xld. 

3.  The  desired  movement  time,  Td. 

4.  A  measure  of  the  current  payload,  PL.  This  measure  can  be  subjective  and  only 
needs  to  be  repeatable  and  monotonic  with  the  payload’s  actual  weight. 

The  first  two  inputs  are  used  directly  by  the  local  joint  controller  and  the  second  two  inputs  are  used 


Figure  2:  Block  diagram  of  a  local  joint  controller.  For  a  robot  with  N 
joints,  the  complete  dynamic  multijoint  control  module  of  the  neural 
network  controller  consists  of  N  distinct  local  joint  controllers.  The 
adaptable  parameters  are  enclosed  by  horizontal  ovals. 
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only  to  compute  parameter  values. 

To  perform  a  movement,  an  internal  estimate  of  the  position  and  velocity  is  generated  for  the  1th 
joint  according  to 

fjfiiO  =  Tix  sgn  (xid  -xi0)  u(xt,  t),  ( 1 ) 


m  = 


U,(r)  -  rl0]  [xid-xt{t)]  } 
6  (xid  ~  Xio)  2 


U(X:,t), 


(2) 


where  xjt)  is  the  position  estimate,  v,(r)  is  the  velocity  estimate,  and  is  the  position  integration 
rate.  The  timing  of  the  movement  is  governed  by  Tix,  which  in  turn  depends  on  the  desired  move¬ 
ment  time.  The  function  u(xt,  t )  is  the  local  movement  signal  defined  by 


1  for  xi0<£t(t)<xid  if  (xid>xl0), 
u(x ;,  0  =  |  1  for  xid  <  i,(0  <  xi0  if  (xid  <  xi0 ) ,  (3) 

®  otherwise. 

This  function  is  1  when  the  movement  mode  parameters  are  to  be  active  and  0  when  only  the  pos¬ 
ture  parameters  are  in  control. 

The  true  velocity  estimate  ~Xj(t)  is  simply  a  constant  value  over  some  period  of  time.  However, 

this  form  does  not  acknowledge  the  inertial  properties  of  the  arm  because  it  presumes  a  step  change 
in  velocity  is  possible.  The  form  for  f>((r)  given  above  is  parabolic  and  has  a  symmetrical  bell¬ 
shaped  profile.  This  form  is  more  realistic  because  it  only  presumes  a  step  change  in  acceleration 
is  possible.  The  scale  factor  is  needed  in  the  parabolic  so  that  the  velocity  estimates  integrates 
over  time  to  be  xid-xiQ,  the  net  change  in  position. 


The  position  estimate  is  then  used  as  the  reference  for  a  position  servo,  described  by 

\  Pos  W  =  Pia  W) "  *,(')]  +  P $  (4) 

where  x^t)  is  the  current  position  of  the  joint,  Pia  is  the  position  servo  gain,  and  is  a  constant 

bias  to  compensate  for  external  forces  such  as  gravity.  This  torque  term  constitutes  the  posture 
mode. 

The  movement  mode  is  composed  of  a  velocity  servo  and  an  open-loop  P  controller.  The  velocity 
servo  is  driven  by  the  velocity  estimate  according  to 

X.-VeiW  =  ^[ViW-V.W],  (5) 

where  v-(r)  is  the  current  velocity  of  the  joint  and  Via  is  the  velocity  servo  gain.  While  the  position 

and  velocity  servos  operate  relative  to  the  estimated  position  and  velocity,  respectively,  the  open- 
loop  P  controller  is  a  simple  spring-like  restoring  force  that  is  relative  to  an  absolute  reference,  the 
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desired  final  position  xid : 

Xi,MovW  =  Mia  (6) 

where  Mia  is  the  gain  (analogous  to  the  stiffness  of  the  spring)  and  Af,p  is  a  bias  offset  (the  rest 

position  of  the  spring  relative  to  the  scaled  desired  final  position).  This  control  formulation  is  called 
“open-loop”  because  the  reference  position  is  fixed  during  the  movement. 

The  total  torque  signal  that  is  sent  to  the  motor  amplifier,  xfj) ,  is  the  sum  of  all  the  individual 
torques  with  the  movement  mode  terms  being  gated  by  the  movement  signal  function: 

=  Xi,PosW+  [^VelW  +  ^'MovW]^-  ')•  O) 

The  role  of  each  term  has  physical  significance.  The  position  servo  maintains  the  final  position 
accuracy  but  not  the  timing  nor  the  stability.  When  the  movement  mode  terms  are  active,  the  open- 
loop  P  controller  provides  on-time  arrival  and  stability.  The  velocity  servo  provides  coordination 
between  the  joints  to  compensate  for  the  dynamic  coupling  forces  to  increase  the  stability  of  the 
coupled  joints.  It  is  one  way  to  deal  with  these  forces,  but  not  the  only  method. 

At  the  end  of  the  movement  phase  (signaled  by  u(xt,  t)  going  from  1  to  0),  the  performance  of  the 

movement  is  observed  by  each  local  joint  controller.  Three  measurements  currently  are  available: 

1.  The  time  when  the  movement  phase  ended,  denoted  by  Ti. 

2.  The  position  of  the  joint,  xim. 

3.  The  velocity  of  the  joint,  vim. 

If  the  movement  had  some  instabilities,  the  arm  may  still  be  in  motion  for  some  small  amount  of 
time  after  the  movement  phase  ended.  When  the  arm  finally  stops  moving  (under  the  sole  influence 
of  the  posture  mode  position  servo),  a  fourth  measurement  is  available:  the  final  position  of  the 
joint,  xLf.  These  measurements  translate  into  the  following  error  quantities  for  each  local  joint  con¬ 
troller: 

1.  The  arrival  time  error,  AT i  =  Td  -  Ti. 

2.  The  movement  position  error,  Axim  =  xid~xim- 

3.  The  movement  velocity  error,  Avim  =  vid  -  vim  which  equals  -vim  since 

vid  =  °- 

4.  The  posture  position  error,  Ax^  =  xid  -  x^. 

These  errors  can  be  used  to  adapt  the  parameters  of  the  local  joint  controller. 

In  the  most  general  form,  the  adaptable  parameters  are  T u,  Pia,  P^,  Via,  Mia,  and 

Although  several  error  functions  are  possible,  the  simplest  ones  that  provided  the  fastest 
convergence  are: 

5r  =  A7’j,  (8) 

1  IX 

A*f/  |4xg>«Pi  (9) 

io  0  otherwise, 


2.  If  desired,  the  scaling  factor  for  the  velocity  estimate  v,-(f)  could  be  made  adaptable  in  the  manner  of  T  ix. 
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5P,  =  top 

(10) 

M ia  - 

(11) 

MiS  =  A*i». 

(12) 

where  is  the  error  tolerance  on  the  final  position  accuracy.  During  our  experiments,  Via  was  held 
constant  for  simplicity  so  no  error  functions  were  explored  yet. 

Each  adaptable  parameter  is  encoded  by  a  neural  network.  These  error  values  are  used  to  update 
the  weights  in  the  respective  networks.  The  basic  network  model  used  by  all  parameters  is  a 
topographical  map  that  is  excited  by  a  fixed  bell- shaped  activation  function  centered  at  the  inputs 
to  the  map.  The  output  of  the  map  is  the  sum  of  the  weighted  outputs  of  the  map.  This  structure  is 
best  illustrated  using  an  example. 


The  position  estimate  integration  rate  (Tix)  depends  only  on  the  desired  movement  time*(Td)  and 
the  initial  (xi0)  and  final  (xid)  positions  of  the  movement  Since  these  three  inputs  are  independent, 
the  corresponding  map  for  Tix  needs  to  have  three  dimensions.  Let  W  represent  the  map  so  Wjk[ 
denotes  the  neural  weight  at  Jti0  =  j,  xid  =  k,  and  T d  =  /.  Furthermore,  let  the  bell-shaped 
activation  function  be  the  three-dimensional  Gaussian  distribution. 


g(j,  k,  l)  =  C  exp 


U  ~ xip)  2+  (k- xid) 2  +  (l-Td)2 
2  a1 


(13) 


where  C  is  a  normalization  constant  and  a  is  the  half-width.  The  map  output  is  computed  using 

T<*  =  XXX <I4> 

J  *  t 

Given  the  error  signal  5r  ,  the  map  is  adapted  using 

1  ix 

AWjU  =  r\STixg(j,  k,  /)  (15) 

where  AWjkl  is  the  change  in  the  weight  at  index  position  (J,  k,  l )  and  r\  is  the  learning  rate. 

The  maps  for  the  other  parameters  are  slightly  more  complex  in  that  there  are  more  inputs.  For 
example,  the  posture  parameters  Pia  and  P^  depend  upon  the  current  payload  PL  and  the  final 

positions  of  all  joints.  The  movement  parameters  Mia  and  depend  upon  the  movement  time, 

the  current  payload,  and  the  initial  and  final  positions  of  all  joints.  In  general,  the  maps  for 
parameters  like  Mia  and  could  ( N  +  2  )-dimensional.  However,  it  is  not  necessary  to  relate  the 

initial  position  of  joint  i  (xi0)  with  the  final  position  of  joint  n  ( xnd ).  It  is  important,  though,  to 
relate  the  movement  of  joint  i  (jc(0,  xid)  with  the  movement  of  joint  n  ( xn0 ,  xnd).  Thus,  the  {N  +  2)- 
dimensional  maps  can  be  reduced  to  a  set  of  N  two-dimensional  maps  that  relate  xi0  and  xid  for 
each  joint  and  then  have  a  separate  two-dimensional  map  for  the  time  and  payload  inputs.  The 
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output  of  all  these  maps  can  be  summed  together  to  form  the  desired  parameter  such  as  Mia.  The 
error  value  8W  is  then  applied  to  all  the  component  maps  in  the  normal  way. 


3.  Performance  Results  and  Discussion 


Since  the  shoulder  and  elbow  joints  of  our  scara  robot  operate  in  a  plane,  there  are  no  external 
forces  such  as  gravity  acting  on  the  arm.  Therefore,  is  fixed  at  zero  for  simplicity.  Furthermore, 

the  position  and  velocity  servo  gains  can  be  kept  constant.  However,  because  both  joints  have 
significant  friction  and  drag,  the  position  servo  gain  Pia  was  changed  from  a  constant  to  the 

deterministic  function  Pia(x,(r)  —  x,(r)),  where 


**,■«« 


10-  160|jc| 
*  2 


for  0  <  \x\  <  0.05, 
for  |jt|  >  0.05. 


(16) 


The  lower  gain  away  from  the  desired  position  prevented  oscillatory  instabilities  from  appearing 
during  the  movement  (which  also  survive  at  the  end  of  the  movement).  Conversely,  the  higher  gain 
near  the  desired  position  produced  good  position  accuracy  in  the  presence  of  the  friction  and  drag. 
If  Pia(x)  was  to  become  adaptable,  only  the  peak  value  at  Pia{ 0)  needs  to  be  adapted. 


The  velocity  servo  gain  was  fixed  at  Via  =  0.5 .  With  Pia,  P&,  and  Via  either  fixed  or  functional, 

the  adaptable  parameters  were  Tix ,  Mia,  and  A/, p .  These  parameters  are  the  key  ones  for  governing 

the  timing  and  the  stability  of  the  movement.  The  learning  rates  for  all  their  error  values  were  set 
to  0.5. 

The  arm  was  trained  to  going  between  two  locations  for  300  iterations  (150  iterations  per 
movement).  The  results  from  one  of  these  movements  are  shown  in  Figures  3-5.  The  results  from 
the  other  movements  are  similar.  These  two  positions  were  chosen  because  they  are  representative 
of  the  cases  when  the  dynamic  coupling  forces  between  the  links  of  the  arm  are  the  strongest.  The 
robot  was  trained  successfully  to  move  between  multiple  points  to  demonstrate  the  matrix  idea  of 
maps  for  the  movement  parameters.  For  these  movements,  the  desired  movement  time  was  set  to 
Td  =  1  second  and  the  payload  was  fixed  at  PL  =  15  pounds. 

The  actual  movement  in  physical  space  is  illustrated  in  Figure  3.  The  path  shown  here  is  the  one 
taken  by  the  arm  after  training  was  stopped.  Note  that  the  path  is  smooth  but  not  linear  in  physical 
space  (and  neither  is  it  linear  in  joint  space).  This  is  quite  satisfactory  for  the  point-to-point  control 
strategy.  The  dynamic  coupling  forces  are  large  in  this  case  because  the  shoulder  joint  actually 
helps  to  over-accelerate  the  elbow  joint  at  the  beginning  of  the  movement.  When  the  shoulder  starts 
to  decelerate,  it  over-decelerates  the  elbow  joint.  For  a  smooth  movement,  the  elbow  joint  must  do 
some  braking  at  the  beginning  of  the  movement  and  some  accelerating  towards  the  end.  If  not,  the 
elbow  joint  will  burst  too  greatly  which  induces  oscillations  in  the  elbow’s  trajectory.  While  the 
path  is  not  of  concern,  these  oscillations  cannot  be  compensated  for  simply  by  the  open-loop  P 
controller.  The  velocity  servo  loop  provides  the  necessary  compensation. 

The  temporal  evolutions  of  the  trajectory  for  both  joints  is  shown  in  Figure  4(a).  All  joint  positions, 
velocities,  and  torques  have  been  normalized  to  their  respective  maximum  ranges.  The  torque 
profile  for  the  shoulder  joint  reveals  the  intuitive  situation  w  hereby  there  is  an  initial  burst  of 
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Figure  3:  The  trajectory  of  the  sample  arm  movement  in  physical 
space  after  training.  The  center  of  the  robot  is  at  (0,0). 

positive  torque  to  get  the  shoulder  moving  and  then  a  gradual  decrease  to  some  negative  torque  to 
provide  the  necessary  braking.  The  effect  of  the  open-loop  P  controller  is  to  skew  the  velocity 
profile  slightly  from  its  symmetrical  velocity  estimate,  producing  a  shorter  burst  and  a  longer 
brake.  On  the  elbow  joint,  the  effect  of  the  coupling  can  be  seen  by  the  significant  difference 
between  the  estimated  and  actual  velocity  curves.  The  initial  velocity  burst  is  due  to  the  coupled 
acceleration  of  the  shoulder  joint.  Similarly,  the  accelerating  torque  towards  the  end  of  the 
movement  is  compensating  for  the  coupled  braking  force  from  the  shoulder. 

The  end  points  at  t  =  1  second  show  no  noticeable  position  error  but  do  have  some  finite  velocity 
error.  The  training  evolution  of  these  errors  for  both  joints  is  shown  in  Figure  4(b).  The  position 
error  quickly  goes  to  zero  and  averages  within  0.2%  of  the  total  movement  distance  for  both  joints. 
The  velocity  error,  on  the  other  hand,  converges  more  slowly.  It  comes  within  4%  of  the  maximum 
estimated  velocity  for  the  shoulder  joint  and  within  8%  for  the  elbow  joint.  In  order  to  increase  the 
convergence  speed,  a  larger  learning  rate  could  be  used  with  the  velocity  error.  In  addition,  a 
velocity  error  threshold  could  be  introduced  that  sets  the  maximum  tolerable  velocity  error,  below 
which  a  zero  error  value  would  be  generated. 

The  training  evolution  of  the  adaptable  parameters  for  both  joints  is  shown  in  Figure  4(c).  Because 
the  velocity  error  had  not  converged  to  zero  yet,  the  parameters  have  not  converged  completely,  but 
they  are  visually  approaching  asymptotes. 

The  evolution  of  the  total  training  error  is  shown  in  Figure  5.  This  quantity,  denoted  by  Er  is 

defined  to  be  the  sum  of  all  the  error  values  over  all  adaptable  parameters  and  all  joints.  It  is  a 
measure  of  how  quickly  a  particular  movement  has  been  learned  from  an  external  performance 
viewpoint  (rather  than  an  internal  parameter  viewpoint).  For  the  sample  movement. 
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Figure  4:  (a)  The  time  evolution  of  the  sample  movement  trajectory  for 
both  joints,  (b)  The  training  evolution  of  the  end-of-movement  posi¬ 
tion  and  velocity  errors  for  both  joints,  (c)  The  training  evolution  of  the 
adaptable  parameters. 
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Figure  5:  The  evolution  of  the  total  training  error. 
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Note  that  even  though  the  parameters  have  not  converged  fully  yet,  the  total  training  error  decreases 
rapidly  initially  and  then  gradually  thereafter.  After  the  initial  decrease,  the  movement  visually 
appears  to  be  quite  acceptable  at  the  end  point.  During  the  remaining  gradual  error  decrease,  the 
parameters  are  being  adjusted  to  their  final  values  by  fine-tuning  the  movement’s  performance.  In 
practical  terms,  the  final  velocity  goes  to  zero  quickly  after  1  second.  The  actual  movement  takes 
about  1.05  seconds,  resulting  in  an  effective  timing  error  of  about  5%. 

These  plots  demonstrate  our  initial  success  with  the  dynamic  multijoint  control  module.  However, 
several  compromises  were  made  to  simplify  the  model.  First,  the  velocity  servo  gain  was  a 
constant.  Ideally,  it  should  be  adaptable  but  no  suitable  error  function  in  terms  of  the  end-point  error 
measurements  could  be  found.  Since  the  velocity  servo  compensates  for  extreme  velocities  during 
the  movement,  it  effectively  is  modifying  the  path.  Since  we  have  no  direct  path  error  information, 
the  error  function  could  be  some  integral  measure  of  a  path  quantity  that  is  observed  only  at  the  end 
point. 

Another  compromise  was  that  a  fixed  payload  was  used.  Since  no  force  sensors  were  available  on 
the  robot  at  the  end  effector,  no  measurement  of  the  payload  was  possible.  Similarly,  a  single 
movement  time  was  considered  for  each  movement  (although  it  could  vary  between  different 
movements).  However,  our  goal  is  to  handle  any  payload  and  any  physically  realizable  movement 
time. 

To  realize  these  capabilities  with  the  dynamic  multijoint  control  formulation,  a  two-pass  training 
approach  can  be  used  and  the  adaptable  parameters  can  have  additional  maps  with  PL  and  Td  as 

their  inputs.  The  output  of  these  maps  can  be  either  additive  or  multiplicative  with  the  total  output 
from  the  spatially-indexed  maps  (those  whose  inputs  are  the  initial  and  final  joint  positions). 
During  the  first  pass,  both  the  payload  and  the  movement  time  are  fixed.  The  robot  is  trained  to 
move  between  any  number  of  locations  and  the  spatially-irdexed  maps  are  updated  with  the 
appropriate  error  values.  Then,  a  different  payload  and/or  movement  time  is  chosen  and  the  training 
is  repeated  between  the  same  positions.  However,  the  spatially-indexed  maps  are  held  constant  this 
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time  and  only  the  maps  indexed  by  payload  and  movement  time  are  updated  with  the  corresponding 
error  values. 

4.  Future  Work 

Given  our  initial  success  with  the  dynamic  multijoint  control  module  of  the  autonomous  neural 
network  controller,  we  plan  to  begin  the  second  phase  of  the  bin-picking  problem  (visually-guided 
reaching).  This  process  involves  the  following  steps  for  the  next  quarter: 

1.  Extending  the  dynamic  multijoint  control  module  to  handle  six  degrees  of  free¬ 
dom  and  demonstrating  dynamic  control  again.  This  step  provides  continuity 
between  robots  and  allows  us  to  investigate  error  functions  for  making  the 
velocity  servo  gain  adaptable.  In  addition,  the  revolute  arm  will  be  affected  by 
gravity,  requiring  the  posture  bias  (P-r)  to  be  adapted.  It  also  allows  us  to  add 
variable  payloads  and  timings  to  the  dynamic  control  module. 

2.  Integrating  the  concepts  associated  with  reaching  in  three  dimensions  as  learned 
by  the  INFANT  model  [Kuperstein  1991]. 
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